30 research outputs found

    Chromatin accessibility reveals insights into androgen receptor activation and transcriptional specificity

    BACKGROUND: Epigenetic mechanisms such as chromatin accessibility impact transcription factor binding to DNA and transcriptional specificity. The androgen receptor (AR), a master regulator of the male phenotype and prostate cancer pathogenesis, acts primarily through ligand-activated transcription of target genes. Although several determinants of AR transcriptional specificity have been elucidated, our understanding of the interplay between chromatin accessibility and AR function remains incomplete. RESULTS: We used deep sequencing to assess chromatin structure via DNase I hypersensitivity and mRNA abundance, and paired these datasets with three independent AR ChIP-seq datasets. Our analysis revealed qualitative and quantitative differences in chromatin accessibility that corresponded to both AR binding and an enrichment of motifs for potential collaborating factors, one of which was identified as SP1. These quantitative differences were significantly associated with AR-regulated mRNA transcription across the genome. Base-pair resolution of the DNase I cleavage profile revealed three distinct footprinting patterns associated with the AR-DNA interaction, suggesting multiple modes of AR interaction with the genome. CONCLUSIONS: In contrast with other DNA-binding factors, AR binding to the genome is not restricted to regions that are accessible to DNase I cleavage prior to hormone induction. AR binding is invariably associated with an increase in chromatin accessibility and, consequently, changes in gene expression. Furthermore, we present the first in vivo evidence that a significant fraction of AR binds only to half of the full AR DNA motif. These findings indicate a dynamic quantitative relationship between chromatin structure and AR-DNA binding that impacts AR transcriptional specificity.

    A hidden Markov random field-based Bayesian method for the detection of long-range chromosomal interactions in Hi-C data

    Motivation: Advances in chromosome conformation capture and next-generation sequencing technologies are enabling genome-wide investigation of dynamic chromatin interactions. For example, Hi-C experiments generate genome-wide contact frequencies between pairs of loci by sequencing DNA segments ligated from loci in close spatial proximity. One essential task in such studies is peak calling, that is, detecting non-random interactions between loci from the two-dimensional contact frequency matrix. Accomplishing this task has many important implications, including identifying long-range interactions that help interpret a sizable fraction of the results from genome-wide association studies. The task, distinguishing biologically meaningful chromatin interactions from massive numbers of random interactions, poses great challenges both statistically and computationally. Model-based methods to address this challenge are still lacking. In particular, no statistical model exists that takes the underlying dependency structure into consideration. Results: In this paper, we propose a hidden Markov random field (HMRF) based Bayesian method to rigorously model interaction probabilities in the two-dimensional space based on the contact frequency matrix. By borrowing information from neighboring loci pairs, our method demonstrates superior reproducibility and statistical power in both simulation studies and real data analysis.

    Evolutionary Computation for Optimal Ensemble Classifier in Lymphoma Cancer Classification

    Owing to the development of DNA microarray technologies, it is possible to measure the expression levels of thousands of genes at once. An effective classification system built on such data can predict the class of a new sample, that is, whether it comes from a normal subject or a patient. Many feature selection methods and classifiers are available for such a system, but no single method is superior to all others for feature selection or classification. Ensemble classifiers have been used to improve performance in this situation, but when many feature selection methods and classifiers are available, it is almost impossible to evaluate every possible ensemble. In this paper, we propose a genetic algorithm (GA) based method for searching for the optimal ensemble of feature-classifier pairs on a lymphoma cancer dataset. We used two ensemble methods, and the GA finds the optimal ensemble very efficiently.
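
The search described in this abstract can be pictured with a minimal sketch. It is not the authors' implementation. It assumes binary labels, encodes an ensemble as a 0/1 mask over candidate feature-classifier pairs, and scores a mask by majority-vote accuracy on held-out predictions. The function names, the truncation selection, the one-point crossover, and all parameter values are illustrative assumptions.

```python
import random


def majority_vote_accuracy(mask, predictions, truth):
    """Accuracy of a majority vote over the members selected by a 0/1 mask.

    predictions[k][i] is the 0/1 label that member k assigns to sample i.
    """
    chosen = [p for bit, p in zip(mask, predictions) if bit]
    if not chosen:
        return 0.0  # an empty ensemble cannot classify anything
    correct = 0
    for i, y in enumerate(truth):
        votes = sum(p[i] for p in chosen)
        label = 1 if 2 * votes > len(chosen) else 0  # ties go to class 0
        correct += (label == y)
    return correct / len(truth)


def ga_search(predictions, truth, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Hypothetical GA over ensemble masks; needs >= 2 members and pop_size >= 4."""
    rng = random.Random(seed)
    n = len(predictions)

    def fitness(mask):
        return majority_vote_accuracy(mask, predictions, truth)

    pop = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]  # keep the better half (elitist)
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit independently with probability p_mut
            child = tuple(bit ^ (rng.random() < p_mut) for bit in child)
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    return best, fitness(best)
```

Because the better half of each generation survives unchanged, the best fitness never decreases from one generation to the next, although the sketch gives no guarantee of reaching the global optimum.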

    Adipose Tissue Gene Expression Associations Reveal Hundreds of Candidate Genes for Cardiometabolic Traits

    Genome-wide association studies (GWASs) have identified thousands of genetic loci associated with cardiometabolic traits including type 2 diabetes (T2D), lipid levels, body fat distribution, and adiposity, although most causal genes remain unknown. We used subcutaneous adipose tissue RNA-seq data from 434 Finnish men from the METSIM study to identify 9,687 primary and 2,785 secondary cis-expression quantitative trait loci (eQTL; <1 Mb from TSS, FDR < 1%). Compared to primary eQTL signals, secondary eQTL signals were located further from transcription start sites, had smaller effect sizes, and were less enriched in adipose tissue regulatory elements. Among 2,843 cardiometabolic GWAS signals, 262 colocalized by LD and conditional analysis with 318 transcripts as primary and conditionally distinct secondary cis-eQTLs, including some across ancestries. Of cardiometabolic traits examined for adipose tissue eQTL colocalizations, waist-hip ratio (WHR) and circulating lipid traits had the highest percentage of colocalized eQTLs (15% and 14%, respectively). Among alleles associated with increased cardiometabolic GWAS risk, approximately half (53%) were associated with decreased gene expression level. Mediation analyses of colocalized genes and cardiometabolic traits within the 434 individuals provided further evidence that gene expression influences variant-trait associations. These results identify hundreds of candidate genes that may act in adipose tissue to influence cardiometabolic traits. © 2019 American Society of Human Genetics

    A Novel EPA-KNN Gene Classification Algorithm


    Linear Multi-class Classification Support Vector Machine


    Combining support vector machines and the t-statistic for gene selection in DNA microarray data analysis

    This paper proposes a new gene selection (or feature selection) method for DNA microarray data analysis. The method efficiently combines the t-statistic with support vector machines. The resulting gene selection method uses both the intrinsic information in the data and the performance of the learning algorithm to measure the relevance of a gene in a DNA microarray. We explain why and how the proposed method works well. The experimental results on two benchmark microarray data sets show that the proposed method is competitive with previous methods. The proposed method can also be used for other feature selection problems. © 2010 Springer-Verlag Berlin Heidelberg
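
The abstract does not say how the two ingredients are combined, so the following sketch covers only the t-statistic part: ranking genes by the absolute two-sample (Welch) t-statistic between two classes. The SVM step is deliberately left out. The function names and the toy data layout (a dict from gene name to per-sample expression values) are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Welch two-sample t-statistic for one gene's expression in two classes.

    Each class needs at least two samples, and the two sample variances
    must not both be zero (otherwise the denominator is zero).
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(expr, labels):
    """Return gene names sorted by decreasing |t|.

    expr maps a gene name to a list of expression values, one per sample;
    labels holds the 0/1 class of each sample, in the same order.
    """
    scores = {}
    for gene, values in expr.items():
        a = [v for v, y in zip(values, labels) if y == 0]
        b = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(t_statistic(a, b))
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (made up): g1 separates the classes, g2 does not.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "g1": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "g2": [2.0, 3.0, 2.5, 2.4, 2.9, 2.1],
}
ranking = rank_genes(expr, labels)
```

A filter of this kind uses only the data, whereas a wrapper step would additionally score candidate genes by classifier performance.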

    Automatic Choice of Control Measurements
